Markov Decision Problems Where Means Bound Variances

Authors

  • Alessandro Arlotto
  • Noah Gans
  • J. Michael Steele
Abstract

We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple affine function of its expected value. The class is characterized by three natural properties: reward boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to check. Implications of the class properties and of the variance bound are illustrated by examples of MDPs from operations research, operations management, financial engineering, and combinatorial optimization.
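The relationship between the mean and the variance of the optimal total reward can be explored numerically on a toy problem. The sketch below is not taken from the paper: the model (one item per period with a reward drawn uniformly from a short list, a fixed capacity, and "reject" as the do-nothing action), the function name `optimal_reward_moments`, and the comparison of the variance against `K * mean` with `K` the largest single-period reward are all assumptions chosen for illustration. The abstract states only that the bound is affine in the mean and does not give its constants.

```python
from fractions import Fraction


def optimal_reward_moments(horizon, capacity, rewards):
    """Exact mean and variance of the optimal total reward in a toy MDP.

    Toy model (an assumption, not the paper's): in each of `horizon` periods
    one item arrives with a reward drawn uniformly from `rewards`. Accepting
    it uses one unit of capacity; rejecting it is the do-nothing action.
    """
    p = Fraction(1, len(rewards))

    # Backward induction: value[t][c] is the optimal expected reward
    # collected from period t onward with c units of capacity left.
    value = [[Fraction(0)] * (capacity + 1) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        for c in range(capacity + 1):
            total = Fraction(0)
            for w in rewards:
                reject = value[t + 1][c]
                accept = w + value[t + 1][c - 1] if c > 0 else reject
                total += p * max(accept, reject)
            value[t][c] = total

    # Forward pass: exact distribution of (capacity left, reward so far)
    # when the optimal policy is followed (ties are broken by accepting).
    dist = {(capacity, 0): Fraction(1)}
    for t in range(horizon):
        nxt = {}
        for (c, r), prob in dist.items():
            for w in rewards:
                take = c > 0 and w + value[t + 1][c - 1] >= value[t + 1][c]
                key = (c - 1, r + w) if take else (c, r)
                nxt[key] = nxt.get(key, Fraction(0)) + prob * p
        dist = nxt

    mean = sum(prob * r for (_, r), prob in dist.items())
    second_moment = sum(prob * r * r for (_, r), prob in dist.items())
    return mean, second_moment - mean ** 2, value[0][capacity]


if __name__ == "__main__":
    rewards = [1, 2, 3]
    mean, var, v0 = optimal_reward_moments(horizon=2, capacity=1, rewards=rewards)
    K = max(rewards)  # uniform bound on single-period rewards (assumed constant)
    print("mean:", mean, "variance:", var, "K * mean:", K * mean)
```

Exact rational arithmetic is used so that the mean from the forward pass can be compared directly with the value function from backward induction, which gives a simple consistency check on the policy.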

Similar resources

Daily Exchange Rate Behaviour and Hedging of Currency Risk

Exchange rates typically exhibit time-varying patterns in both means and variances. The histograms of such series indicate heavy tails. In this paper we construct models which enable a decision-maker to analyze the implications of such time series patterns for currency risk management. Our approach is Bayesian where extensive use is made of Markov chain Monte Carlo methods. The effects of severa...

Second Order Optimality in Transient and Discounted Markov Decision Chains

The article is devoted to second order optimality in Markov decision processes. Attention is primarily focused on the reward variance for discounted models and undiscounted transient models (i.e. where the spectral radius of the transition probability matrix is less than unity). Considering the second order optimality criteria means that in the class of policies maximizing (or minimiz...

Truncation of Markov Decision Problems with a Queueing Network Overflow Control Application

Conditions are provided to conclude an error bound for truncations and perturbations of Markov decision problems. Both the average and finite horizon case are covered. The results are illustrated by a truncation of a Jacksonian queueing network with overflow control. An explicit error bound for this example is obtained. Key-words: Markov Decision Problems * Truncation * Perturbation * Error Boun...

Low Decision Space Means No Decentralization in Fiji; Comment on “Decentralisation of Health Services in Fiji: A Decision Space Analysis”

Mohammed, North, and Ashton find that decentralization in Fiji shifted health-sector workloads from tertiary hospitals to peripheral health centres, but with little transfer of administrative authority from the centre. Decisionmaking in five functional areas analysed remains highly centralized. They surmise that the benefits of decentralization in terms of services and outcomes will be limited....

Exploiting Similarity Information in Reinforcement Learning - Similarity Models for Multi-Armed Bandits and MDPs

This paper considers reinforcement learning problems with additional similarity information. We start with the simple setting of multi-armed bandits in which the learner knows for each arm its color, where it is assumed that arms of the same color have close mean rewards. An algorithm is presented that shows that this color information can be used to improve the dependency of online regret boun...

Journal:
  • Operations Research

Volume 62

Publication date: 2014